Using accent-specific pronunciation modelling for improved large vocabulary continuous speech recognition
Abstract
A method of modelling accent-specific pronunciation variations is presented. Speech from an unseen accent group is phonetically transcribed so that pronunciation variations may be derived. These context-dependent variations are clustered in decision trees, which are used as a model of the pronunciation variation associated with this new accent group. The trees are then used to build a new pronunciation dictionary for use during the recognition process. Experiments are presented, based on the Wall Street Journal and WSJCAM0 corpora, for the recognition of American speakers using a British English recogniser. Speaker-independent as well as speaker-dependent adaptation scenarios are presented, giving up to a 20% reduction in word error rate. A linguistic analysis of the pronunciation model is presented, and finally the technique is combined with maximum likelihood linear regression, a well-proven acoustic adaptation technique, yielding further improvement.
Similar resources
Acoustic and Lexical Modeling Techniques for Accented Speech Recognition
Speech interfaces are becoming pervasive among the general public with the prevalence of smart phones and cloud-based computing. This pushes Automatic Speech Recognition (ASR) systems to handle a wide range of environments, including different channels, noise conditions and speakers with varying accents. This thesis focuses on the impact of speakers’ accents on the ASR models and techniques to make...
Using accent-specific pronunciation modelling for robust speech recognition
A method of modelling accent-specific pronunciation variations is presented. Speech from an unseen accent group is phonetically transcribed such that pronunciation variations may be derived. These context-dependent variations are clustered in a decision tree which is used as a model of the pronunciation variation associated with this new accent group. The tree is then used to build a new pronunc...
Pronunciation Modelling in the Rwth Large Vocabulary Speech Recognizer
In this paper we describe the application of pronunciation variants for our large vocabulary continuous speech recognizer. We will explain how the pronunciation variants were used in training and recognition and give some recognition results on three different corpora. The recognition tests were performed on the Wall Street Journal (WSJ) November 92 development and evaluation corpora (5 000 wor...
Spoken Term Detection for Persian News of Islamic Republic of Iran Broadcasting
Islamic Republic of Iran Broadcasting (IRIB), as one of the biggest broadcasting organizations, produces thousands of hours of media content daily. Accordingly, the IRIB's archive is one of the richest archives in Iran, containing a huge amount of multimedia data. Monitoring this massive volume of data, and browsing and retrieval of this archive, is one of the key issues for this broadcasting...
The SpeeD Grammar-based ASR System for the Romanian Language
This paper describes the grammar-based automatic speech recognition system for the Romanian language developed by the Speech and Dialogue Research Group. The paper links to previous work for the issues related to large vocabulary speech recognition and focuses on the specific optimization work done for several closed-vocabulary, grammar-based speech recognition tasks. Among the specific problem...